Reselection of a Genomic Upstream Open Reading Frame in Mouse 
Hepatitis Coronavirus 5'-Untranslated-Region Mutants 
An AUG-initiated upstream open reading frame (uORF) encoding a potential polypeptide of 3 to 13 amino acids (aa) is found within the 5' untranslated region (UTR) of >75% of coronavirus genomes based on 38 reference strains. Potential CUG-initiated uORFs are also found in many strains. The AUG-initiated uORF is presumably translated following genomic 5'-end cap-dependent ribosomal scanning, but its function is unknown. Here, in a reverse-genetics study with mouse hepatitis coronavirus, the following were observed. (i) When the uORF AUG-initiating codon was replaced with a UAG stop codon along with a U112A mutation to maintain a uORF-harboring stem-loop 4 structure, an unimpaired virus with wild-type (WT) growth kinetics was recovered. However, reversion was found at all mutated sites within five virus passages. (ii) When the uORF was fused with genomic (main) ORF1 by converting three in-frame stop codons to nonstop codons, a uORF-ORF1 fusion protein was made, and virus replicated at WT levels. However, a frameshifting G insertion at virus passage 7 established a slightly 5'-extended original uORF. (iii) When uAUG-eliminating deletions of 20, 30, or 51 nucleotides (nt) were made within stem-loop 4, viable but debilitated virus was recovered. However, a C80U mutation in the first mutant and an A77G mutation in the second appeared by passage 10, which generated alternate uORFs that correlated with restored WT growth kinetics. In vitro, the uORF-disrupting nondeletion mutants showed enhanced translation of the downstream ORF1 compared with the WT. These results together suggest that the uORF represses ORF1 translation yet plays a beneficial but nonessential role in coronavirus replication in cell culture. 


Upstream open reading frames (uORFs) are present in ~40% of eukaryotic mRNAs (1, 2) and are found in the mRNAs of many viruses that infect eukaryotes (3-6). The function of the uORF is not known in a majority of cases, but in many mRNAs, it has been shown to cause repression of translation of the downstream (main) ORF (1, 2), usually following 5'-cap-dependent translation of the uORF. In other cases, 5'-cap-dependent translation of the uORF enhances translation of the main ORF by various mechanisms (1, 2, 4, 7-11). Some plant (12) and animal (13-15) viruses that have a positive-strand (mRNA-like) genome which undergoes necessary 5'-cap-dependent translation prior to viral genome replication in the cytoplasm also have a (usually single) short uORF. It might be expected that in these cases, the uORF in the genome would be a regulator of not only translation but also virus replication and perhaps also virus-induced pathogenesis. A single AUG-initiated uORF is found in the genomes of arteriviruses (13, 14, 16) and most coronaviruses (17; this study), two families of animal positive-strand RNA viruses in the order Nidovirales (18). The role of the uORF in these viruses has undergone limited study. 

Arteriviruses and coronaviruses share features with regard to genome structure and replication (Fig. 1A shows a schematic of the mouse hepatitis coronavirus [MHV] genome and subgenomic mRNAs [sgmRNAs]) (18). The genomes are long (~12 kb for arteriviruses and ~30 kb for coronaviruses), single-strand molecules that are 5' capped and 3' polyadenylated and undergo replication via a full-length minus-strand (antigenome) intermediate in the cytoplasm, although to date, only coronaviruses have been shown to encode an N7-methyltransferase and a 2'-O-methyltransferase needed for methylated cap formation (18-24). A guanylyltransferase has not yet been characterized for either virus. Both arteriviruses and coronaviruses are presumed to use 5'-cap-dependent, 5'-terminal 40S ribosomal entry with subsequent ribosomal scanning for translation of the genome. Both make a 3'-coterminal nested set of (five to nine) sgmRNAs, each of which has a 5'-terminal leader identical to the single-copy leader on the genome (16, 25). It is thought that for viruses in both families, the leader on sgmRNAs is acquired during minus-strand synthesis when the templates for the sgmRNAs are made (26, 27). The mechanism for leader acquisition is thought to be a template switching of the RNA-dependent RNA polymerase (RdRp) during minus-strand synthesis from pentameric (arteriviruses) or heptameric (coronaviruses) donor signaling sequences at intergenic regions within the genome (often called the transcription regulatory sequence [TRS]) to an equivalent acceptor sequence near the 3' end of the 5'-terminal leader on the genome (26-29). With respect to the 5' untranslated region (UTR) and AUG-initiated uORF arrangement, however, arteriviruses and coronaviruses differ in the following ways. (i) In arteriviruses, although the genomic 5'-UTR length is similar to the shortest in coronaviruses (~200 to 225 nucleotides [nt] for arteriviruses versus ~200 to 500 nt for coronaviruses), the leader is longer (~200 nt for arteriviruses versus 65 to 90 nt for coronaviruses) (16, 17). (ii) In arteriviruses, the uORF maps within the leader, whereas in coronaviruses, the uORF maps just downstream of the genomic leader. As a consequence, the uORF is found on the genome and on each sgmRNA in arteriviruses, whereas in coronaviruses, the uORF, with very few exceptions (30), is found only on the genome (Table 1). 

FIG 1 MHV genomic 5' UTR. (A) MHV genome and subgenomic mRNAs. A uORF is found within the 5' UTR of the genome but not sgmRNAs. ORF1 is translated from the genome beginning at nt 210 to produce a polyprotein that is co- and posttranslationally processed into 16 replicase-related nonstructural proteins. The 3' nested set of sgmRNAs is translated to produce the virion structural proteins. A pseudoknot-induced −1 frameshifting event at the ORF1a/1b junction during translation maintains an optimal ratio of ORF1a and ORF1b proteins for virus replication. The filled bar at the 5' terminus of each mRNA species represents the common leader that is encoded only at the genomic 5' end. (B) RNA structures in the MHV genomic 5' UTR. Shown are stem-loops 1 through 5 identified by bioinformatic, genetic, and physical structure analyses. Nucleotides 140 through 170 form a long-range RNA-RNA interaction with downstream nt 332 through 363 (not shown). The underlined heptameric sequence UCUAAAC in stem-loop 3 at the 3' terminus of the leader is the core RdRp template-switching signal that directs leader acquisition on MHV sgmRNAs. Boxes identify the uORF start codon (nt 99), the genomic ORF1 start codon (nt 210), and a second nearby potential alternate ORF1 start codon (nt 219) as well as three in-frame stop codons for the uORF. Positions used for deleting regions of stem-loop 4 (nt 96 through 115, 91 through 120, 80 through 130, and 75 through 138) are identified. Potential CUG-initiated translation start sites in frame with the uORF and ORF1 are found beginning at nt 111 and 159. 

The role that the uORF plays in nidoviruses has been examined most closely in arteriviruses (13, 14). When the AUG start codon for the uORF in equine arteritis virus, which is in a suboptimal Kozak context for translation, was changed to an AGG nonstart codon by mutation in a reverse-genetics analysis, or when the Kozak context was made optimal, the resulting virus plaque size was smaller than that of the wild type (WT), and growth kinetics were found to be impaired (13). In this case, reselection of a uORF start codon in its original suboptimal context was found upon virus passaging in cell culture. In another similar reverse-genetics study with the same virus, growth impairment was not observed with an AUG→AGG mutation, but reversion to a WT AUG was found upon virus passaging (14). These studies together would indicate that the uORF plays a beneficial role in arterivirus survival in cell culture, but the contribution of the uORF to fitness has not been characterized. In betacoronaviruses, features of the uORF in MHV were learned when the cis-acting properties of the stem-loop 4 structure, which harbors the uORF, were investigated by reverse genetics (31). In a previous study by Yang et al. (31), it was found that virus with a 30-nt deletion of a distal portion of stem-loop 4 (nt 91 through 120), which removed almost all of the uORF, surprisingly remained viable although mildly debilitated, whereas deletion of a predicted 64-nt-long version of a complete stem-loop 4 (nt 75 through 138) was lethal. It was also shown that mutation of the uORF AUG to a nonstart AGG codon was detrimental to virus growth in cell culture. In studies described here using the same strain of MHV (MHV-A59), carried out largely concurrently with those of Yang et al. (31) and with some of the same mutations, we confirm the findings of Yang et al. regarding the behavior of deleted features of stem-loop 4 but also extend these findings by describing the phenomenon of uORF reselection and demonstrating that the deletion of a predicted 51-nt-long shorter version of a complete stem-loop 4 (nt 80 through 130) is viable. 

Here, with a reverse-genetics system for MHV, three different experimental approaches were used to disrupt the AUG-initiated uORF and test for the tendency of the virus to restore an intact uORF, by reversion or by compensatory changes, upon passaging of progeny virus. In all three approaches, restoration of a uORF was found in most mutants within 8 to 10 passages, although the uORF per se was not necessary for virus replication in cell culture. In addition, the AUG-mutated uORF (but not the AUG-deleted uORF) correlated with a high virus titer in cell culture, and with a subcloned MHV 5'-proximal sequence that was translated in vitro in a rabbit reticulocyte translation system, the AUG-mutated uORF correlated with up to a 1.6-fold-higher translation yield. Therefore, the AUG-initiated uORF confers some attenuation of translation of the downstream (main) ORF1. Inspection of the group-classified reference strains of coronaviruses also revealed potential CUG-initiated uORFs in subgroup-specific distribution patterns. The potential CUG-initiated uORFs are described but were not studied further. These results together indicate that the MHV genomic AUG-initiated uORF, although it represses translation from ORF1, must play a beneficial role in virus survival in cell culture, as evidenced by uORF reselection following its disruption or removal. Further studies are needed to establish the nature of this benefit. 

MATERIALS AND METHODS 

Virus and cells. The A59 strain of MHV (GenBank accession number NC_001846) was used for reverse-genetics analyses (32). Delayed brain tumor (DBT) cells (33), mouse L2 cells (34), and baby hamster kidney cells expressing the MHV receptor (BHK-MHVR) (35, 36) were grown in Dulbecco’s modified Eagle’s medium (DMEM) supplemented with 10% defined fetal calf serum (FCS) (HyClone) and 20 µg/ml gentamicin (Invitrogen). Cells were maintained at 37°C with 5% CO₂ for all experiments. BHK-MHVR cells were maintained in selection medium containing 0.8 mg/ml Geneticin (G418 sulfate; Invitrogen) (32). 

RNA structure prediction. The mfold program of Zuker (http://www.bioinfo.rpi.edu/zukerm/) (37, 38) was used for RNA structure predictions. 

MHV reverse-genetics system. The reverse-genetics system for MHV-A59, infectious clone MHV-A59-1000 (icMHV), developed and kindly provided by Ralph Baric and colleagues (32), was used as previously described in detail for making 5'-proximal mutations in the MHV genome (39). Viral mutants were made by modifying fragment A (39) with the appropriate primers for the mutations described below. All procedures for mutant plasmid construction with icMHV DNA, plasmid DNA ligation, synthesis of full-length mutated recombinant viral RNA, transfection of cells with infectious recombinant RNA by electroporation, and characterization of mutant progeny by virus titration and growth kinetics were carried out as previously described (39). Plaque morphology was determined on L2 cells after 60 h of growth and after crystal violet staining, as described previously (39). Plaque sizes were identified as large (WT) if they were ≥2.5 mm, medium if they were 1.5 to 2.5 mm, or small if they were <1.5 mm in diameter. Plaque images were captured by laser scanning or by photography with a Nikon digital camera and prepared with Adobe Photoshop software. 
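
The plaque size classes above can be written as a small helper. This is only an illustrative sketch: the function name is hypothetical, and the handling of the shared 2.5-mm boundary (assigned to "large") is an assumption based on the thresholds as stated.

```python
def classify_plaque(diameter_mm: float) -> str:
    """Return the plaque size class for a plaque diameter in millimetres.

    Thresholds follow the text: large (WT-like) at 2.5 mm or more,
    medium from 1.5 mm up to 2.5 mm, small below 1.5 mm.
    Assigning exactly 2.5 mm to "large" is an assumption.
    """
    if diameter_mm < 0:
        raise ValueError("plaque diameter must be non-negative")
    if diameter_mm >= 2.5:
        return "large"
    if diameter_mm >= 1.5:
        return "medium"
    return "small"


if __name__ == "__main__":
    # Hypothetical diameters, for illustration only.
    for d in (3.0, 2.0, 1.0):
        print(d, classify_plaque(d))
```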

Genome sequence analysis of virus progeny. Routinely, supernatant fluids from cells that first showed cytopathic effect (CPE) (either cells that had been transfected or cells that had been blind passaged) were collected, and the harvested virus was named virus passage zero (VP0). When 80 to 100% of new DBT cells infected with VP0 virus showed CPE, intracellular RNA was TRIzol (Invitrogen) extracted, and the viral genome was sequenced by reverse transcription-PCR (RT-PCR) for the 5'-proximal nt 22 to 1093. VP0 virus was then used to determine plaque morphology, and plaque-purified virus was used as the starting material for determining growth kinetics on DBT cells and sequence analyses. 

For analysis of the 5' nt 22 to 1093 of progeny virus genomes, extracted cellular RNA was reverse transcribed with SuperScript II reverse transcriptase (Invitrogen), using primer MHV-1094(+) (5'-CGATCAACGTGCCAAGCCACAAGG-3'), which binds MHV genomic nt 1094 to 1117, and cDNA was PCR amplified with primers MHV-leader(−) (5'-TATAAGAGTGATTGGCGTCCG-3'), which binds nt 1 to 21 of the MHV antileader, and MHV-1094(+). PCR products were gel purified (Qiaex II; Qiagen) prior to automated sequencing with primers MHV(261-284)(−) (5'-CCATGGATGCTTCCGAACGCATCG-3') and MHV(605-623)(+) (5'-GTTACACAGGCAGACGCGC-3'). 
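
As a consistency check on the primer set above, the following sketch compares each primer's length with the coordinate span it is said to cover and derives the region left between the two amplification primers. The coordinates for the two sequencing primers are taken from their names, which is an assumption; the helper names are hypothetical.

```python
# Primer name -> (sequence 5'->3', first genome nt covered, last genome nt covered).
# Coordinates for the two sequencing primers are inferred from the primer names.
PRIMERS = {
    "MHV-1094(+)": ("CGATCAACGTGCCAAGCCACAAGG", 1094, 1117),
    "MHV-leader(-)": ("TATAAGAGTGATTGGCGTCCG", 1, 21),
    "MHV(261-284)(-)": ("CCATGGATGCTTCCGAACGCATCG", 261, 284),
    "MHV(605-623)(+)": ("GTTACACAGGCAGACGCGC", 605, 623),
}


def span_length(first_nt: int, last_nt: int) -> int:
    """Number of nucleotides in an inclusive, 1-based coordinate span."""
    return last_nt - first_nt + 1


def check_primer_lengths(primers: dict) -> dict:
    """Map each primer name to True if its length equals its coordinate span."""
    return {
        name: len(seq) == span_length(first, last)
        for name, (seq, first, last) in primers.items()
    }


def region_between(upstream_primer: tuple, downstream_primer: tuple) -> tuple:
    """Inclusive coordinates of the template lying between two primers."""
    _, _, up_last = upstream_primer
    _, down_first, _ = downstream_primer
    return (up_last + 1, down_first - 1)


if __name__ == "__main__":
    print(check_primer_lengths(PRIMERS))
    print(region_between(PRIMERS["MHV-leader(-)"], PRIMERS["MHV-1094(+)"]))
```

Under these assumptions, the region between the leader primer and the nt 1094 primer is nt 22 to 1093, which matches the sequenced interval stated in the text.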

Northern analysis. Northern analysis was done as previously described (40). Briefly, freshly confluent DBT cells in 25-cm² flasks (~4 × 10⁶ cells) were infected with WT or mutant viruses at a multiplicity of infection (MOI) of 1.0 PFU/cell. At 20 h postinfection (hpi), intracellular RNA was TRIzol extracted, and 1/10 of the total RNA from one 25-cm² flask (~60 µg RNA total per 25-cm² flask) was resolved by electrophoresis in a 1.0% agarose-formaldehyde gel at 150 V for 4 h. RNA was transferred to a HyBond N+ nylon membrane (Amersham Biosciences) by vacuum blotting for 3 h, followed by UV cross-linking. After prehybridization of the membrane with NorthernMax Prehybridization/Hybridization buffer (Ambion) at 55°C for 4 h, the blot was probed at 55°C overnight with 20 pmol (~4 × 10⁵ cpm/pmol) of γ-³²P-5'-end-labeled 3'-UTR-specific oligonucleotide MHV(31094-31122)(+) (5'-CAGCAAGACATCCATTCTGATAGAGAGTG-3'), which binds MHV genomic nt 31094 to 31122. Probed blots were exposed to Kodak XAR-5 film at −80°C for imaging, and images were prepared by using Adobe Photoshop software. 

Construction of plasmids for generating transcripts for in vitro translation. For in vitro translation analysis of a large portion of the nonstructural protein 1 (nsp1) gene containing the 5' UTR with mutations, a WT construct was made, which fused the 5'-proximal 899 nt of the genome precisely with the 3' UTR that has an attached 65-nt poly(A) tail. For this, plasmid A of the cloned MHV-A59 genome containing an upstream T7 promoter and all of the nsp1 coding region (32) was used to prepare the 5'-end fragment, and plasmid G (32) was used to prepare the 3'-end fragment. The final cloned sequence was made by overlapping the two PCR fragments at the junction sites, reamplifying with primers T7startMHV and EcoRI-65A-MHV(+), and cloning into the TOPO-XL vector (Invitrogen) between the two EcoRI sites. Plasmids with specific mutations were made by modifying the WT plasmid with the appropriate primers. Insert and junction sequences in all constructs were confirmed by DNA sequencing. 

In vitro transcription. To prepare RNA for in vitro translation, the DNA template was removed from the TOPO plasmid by EcoRI digestion and purified by gel electrophoresis. Capped transcripts were made with the T7 mMessage mMachine kit (Ambion), according to the manufacturer’s protocol, which places the m7GpppG cap on ~80% of transcripts (Ambion). 

In vitro translation. For in vitro translation, 100 ng of transcript was translated for 1 h at 30°C in a 25-µl mixture containing 17.5 µl rabbit reticulocyte lysate (RRL) (Promega), 2 nM amino acid mixture minus methionine, 10 U RNasin RNase inhibitor (Promega), and 20 µCi of [³⁵S]methionine. Radiolabeled proteins were resolved by sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE) in gels of 12% polyacrylamide, and dried gels were exposed to Kodak XAR-5 film for imaging. Bands were removed, and radioactivity was quantitated by scintillation counting. Radioactive counts were normalized to the number of methionine residues in the WT. For a loading control, 500 ng of each sample was resolved by agarose gel electrophoresis, the gel was stained with ethidium bromide, and the image was captured by Fotodyne UV26 photography followed by band density quantitation using TINA version 2.0 (Raytest, Germany). 
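
One way to read the normalization step is sketched below: counts for each band are divided by the number of methionine residues in that product, and the result is expressed as a percentage of the WT value. This interpretation, the function name, and all numbers in the usage lines are assumptions for illustration and are not data from this study.

```python
def met_normalized_percent(counts: float, met_residues: int,
                           wt_counts: float, wt_met_residues: int) -> float:
    """Express a band's per-methionine counts as a percentage of the WT band.

    counts / met_residues gives the signal per labeled methionine, so that
    products with different numbers of methionines can be compared.
    """
    if met_residues <= 0 or wt_met_residues <= 0:
        raise ValueError("methionine counts must be positive")
    if wt_counts <= 0:
        raise ValueError("WT counts must be positive")
    per_met = counts / met_residues
    wt_per_met = wt_counts / wt_met_residues
    return 100.0 * per_met / wt_per_met


if __name__ == "__main__":
    # Hypothetical values: a mutant band with 1.6 times the WT counts and
    # the same number of methionines as the WT product.
    print(met_normalized_percent(1600.0, 10, 1000.0, 10))
    # Hypothetical fusion product carrying one extra methionine.
    print(met_normalized_percent(1100.0, 11, 1000.0, 10))
```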

RESULTS 

An AUG-initiated uORF is found in the genomes of a majority of coronavirus species. An analysis of sequenced coronavirus genomes available in GenBank showed that a uORF, similar to that depicted for MHV-A59 in Fig. 1B, is present usually in a suboptimal Kozak context in >75% of species, as represented by the 38 reference strains (Table 1). In the betacoronavirus subgroup, these include bovine coronavirus (BCoV), the highly studied MHV, severe acute respiratory syndrome coronavirus (SARS-CoV), and the recently identified Middle East respiratory syndrome coronavirus (MERS-CoV) (41). The uORF maps downstream of the (65- to 90-nt) common leader and potentially encodes a peptide of 3 to 13 amino acids (aa) in length (Table 1). An AUG-initiated uORF is not found in bat coronavirus HKU9-1, a currently categorized betacoronavirus D member; in beluga whale virus SW1, a gammacoronavirus; or in 7 of 10 recently characterized deltacoronaviruses (42) (Table 1). However, in these viruses, inspection reveals the presence of one to eight potential CUG-initiated ORFs that could encode peptides of 2 to 89 aa (Table 2). Potential CUG-initiated uORFs are also present in most viruses with an AUG-initiated ORF as well, and interestingly, patterns of the potential CUG-initiated ORFs differ among the coronavirus subgroups (Table 2) (see Discussion). 
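
The kind of inspection described above can be sketched as a scan for a start codon followed by the first in-frame stop codon. The fragment used here is the WT MHV-A59 sequence spanning nt 93 to 149 as transcribed from Fig. 2A (the internal ellipsis region of the figure is not included); the function name and the fragment variable are hypothetical, and the scan is a simplified illustration rather than the procedure used to build Tables 1 and 2.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}

# WT MHV-A59 nt 93-149, transcribed from Fig. 2A (assumed free of OCR errors).
WT_NT_93_149 = "GUGUCCAUGCCCGCGGGCCUGGUCUUGACUUAGUGCUGACAUUUGUAGUUCCUUGAC"


def find_uorfs(rna: str, first_nt: int = 1, start_codon: str = "AUG") -> list:
    """List (start_nt, stop_nt, peptide_length_aa) for each start codon
    that is followed by an in-frame stop codon within the fragment.

    first_nt is the genome coordinate of rna[0]; start_nt and stop_nt are
    the coordinates of the first nucleotide of the start and stop codons.
    """
    hits = []
    for i in range(len(rna) - 2):
        if rna[i:i + 3] != start_codon:
            continue
        # Walk codon by codon from the codon after the start codon.
        for j in range(i + 3, len(rna) - 2, 3):
            if rna[j:j + 3] in STOP_CODONS:
                hits.append((i + first_nt, j + first_nt, (j - i) // 3))
                break
    return hits


if __name__ == "__main__":
    print(find_uorfs(WT_NT_93_149, first_nt=93))                     # AUG-initiated
    print(find_uorfs(WT_NT_93_149, first_nt=93, start_codon="CUG"))  # CUG-initiated
```

On this fragment the AUG scan finds the uORF beginning at nt 99 and ending at the stop codon at nt 123 (8 codons), and the CUG scan finds candidates at nt 111 and nt 128, the latter matching the 6-aa MHV-A59 entry in Table 2.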

It is notable that the AUG-initiated uORFs in the laboratory-studied betacoronaviruses MHV, BCoV, and SARS-CoV are found associated with a phylogenetically conserved stem-loop 4 (15, 31). Stem-loop 4 in BCoV (formerly called stem-loop III [15]) has been shown to be a cis-acting element in defective interfering (DI) RNA replication (15). However, as shown by Yang et al. (31), neither a functional uORF AUG codon nor a uORF-containing portion of stem-loop 4 is required for MHV replication. The significance of the association of the uORF with stem-loop 4 in betacoronaviruses is not known. 

Translation of the uORF in MHV is observed when measured in vitro as a uORF-ORF1 fusion protein. In initial experiments to test for a translation product from the MHV uORF that contains a start codon within a suboptimal Kozak context, GUGUCCAUGC (where the optimal sequence is GCC G/A CCAUGG, in which underlining identifies the −3 and +4 nucleotide positions relative to A in the AUG start codon [in boldface] [43]), a WT construct was made, in which the 5' 899 nt of the WT MHV-A59 genome (which includes the 5' UTR and 93% of the N-proximal nsp1 coding region within ORF1) was attached to the genomic 3' UTR and 65-nt poly(A) tail. From this construct, T7-generated transcripts were translated in RRL, and the [³⁵S]Met-radiolabeled products were resolved by SDS-PAGE. Since an 8-aa peptide from the uORF was not discernible on a polyacrylamide gel (data not shown), a fusion was made between the uORF and a partial nsp1 ORF and tested for translation in RRL. For this test, the three in-frame sequential stop codons for the uORF (U₁₂₃AG, U₁₂₉GA, and U₁₃₈AG) were converted to translatable codons (CAG, CGA, and CAG) to form a 5'-proximal sequence identical to that in virus mutant M3 (described below) (Fig. 2A). From this construct, T7 RNA polymerase-generated transcripts were made and translated in RRL in the presence of [³⁵S]Met. Polyacrylamide gel electrophoresis of the M3 translation products (Fig. 2C) revealed a fusion protein from the uORF (top band) and a product starting from nt 210 (and possibly also nt 219) (bottom band). These results indicate that although there is probable leaky scanning through the uORF leading to synthesis of the shorter of the two products, the uORF does function as a translation template that makes the fusion protein in vitro and therefore is likely to be translated in vivo as an independent uORF. 
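
The Kozak comparison above rests on the nucleotides at positions −3 and +4 relative to the A of the start codon. A minimal sketch of that check follows; the three-level labels ("strong", "adequate", "weak") and the function name are illustrative assumptions, not terms used in this study.

```python
PURINES = {"A", "G"}


def kozak_context(rna: str, start_index: int) -> str:
    """Classify a start codon by its -3 and +4 positions.

    start_index is the 0-based index of the first nucleotide of the start
    codon (position +1). Position -3 is three nucleotides upstream and
    position +4 is the nucleotide immediately after the codon.
    """
    if start_index < 3 or start_index + 3 >= len(rna):
        raise ValueError("fragment must include positions -3 and +4")
    purine_at_minus3 = rna[start_index - 3] in PURINES
    g_at_plus4 = rna[start_index + 3] == "G"
    if purine_at_minus3 and g_at_plus4:
        return "strong"
    if purine_at_minus3 or g_at_plus4:
        return "adequate"
    return "weak"


if __name__ == "__main__":
    # uORF start codon context quoted in the text (AUG begins at index 6).
    print(kozak_context("GUGUCCAUGC", 6))
    # ORF1 start codon context at nt 210 as shown in Fig. 2A.
    print(kozak_context("UGCAUAAUGG", 6))
    # Optimal consensus with A at -3.
    print(kozak_context("GCCACCAUGG", 6))
```

With this rule, the uORF context GUGUCCAUGC has a pyrimidine at −3 and a C at +4, whereas the ORF1 context at nt 210 has both favorable positions.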

To examine the viability of a recombinant virus containing 
these mutations, mutant M3 virus was made and tested. M3 virus 
grew within 48 h posttransfection (hpt) with recombinant RNA 
and replicated in cell culture to titers similar to those of the WT 
(Fig. 2D), and an RT-PCR test of the M3 genomic RNA sequence 
within cells at virus passage 3 revealed that it had maintained the 
TABLE 2 Potential coronavirus CUG-initiated uORF sizes in 38 GenBank reference strains 


Virus^a 

5' UTR 
(nt) 

Potential CUG-initiated uORF and ORF1 start codons within the Kozak context^b 

uORF peptide 
length (aa) 

GenBank accession no. 
of reference sequence 

Alphacoronavirus 





TGEV-Purdue 

314 

(None).. ,AGGAGAA 315 TGA... 


DQ811788 

FCoV 

311 

5'.. .CCGTCCC 209 TGT.. ,T 312 GA... 

34 

NC_002306 



TATTAGC 236 TGC.. ,T 257 AG.. ,AGGAGAA 312 TGA... 

7 


RhBtCoV-HKU2 

296 

5'.. ,ATCTATC 21 TGT.. .T 45 AG... 

8 

NC_009988 



CCCACGC 232 TGT.. ,T 259 AG... 

9 




GCTGTTC 251 GTT.. ,T 276 GA... 

13 




CGATAAC 2SS TGT.. ,GCACAA 297 TGT... (joins ORF 1) 

3 


HCoV-NL63 

286 

5'.. ,CTAGTGC 89 TGT.. ,TTTGTTA 101 TGG... (joins AUG-initiated uORF) 

4 

NC_005831 



TGTAAAC 143 TGG.. ,T 197 AG... 

18 c 




TAAGCAC IS0 TGG.. ,T 216 AA... 

12 c 




CCGTCAC 233 TGC.. ,T 275 AA.. ,GCTAACCA 2S7 TGT... 

14 


HCoV-229E 

292 

5'.. ,TTGATGC 105 TGG.. ,T 114 AG... 

3 C 

NC_002645 



caagtgc 161 tgt. . ,T 177 AA... 

5 




AAAGTTC 262 TGT.. ,T 328 GA.. ,TTCCTAA 293 TGG... (overlaps ORF1 start) 

23 


ScBtCoV-512 

293 

5'.. ,GTCGTGC 166 TGC.. ,T 289 AG... 

41 

NC_009657 



GAAAGTC 258 TGT.. ,t 273 ga. . ,TTAGCTA 294 TGG... 

5 


PEDV 

296 

5'.. ,gctgtgc 169 tgt. . ,t 271 ag. .. 

34 

NC_003436 



tagttcc 183 tgg. . ,t 213 ag. . ,ccggcta 297 tgg. .. 

10 


MiBtCoV-1A 

271 

5'.. ,aggtggc 195 tgc. . ,t 264 agcaggta 272 tgt. .. 

23 

NC_010437 

MiBtCoV-1B 

272 

5'.. .TTCCGTC 1SS TGT.. ,T 233 AG... 

19 

NC_010436 



AAGTGGC 196 TGC.. ,T 265 AGCAGGTA 273 TGC... 

23 


MiBtCoV-HKU8 

268 

5'.. ,TTTAGAC 48 TGT.. ,T 69 AA... 

7 

NC_010438 



CTCGCAC 166 TGT.. ,T 205 AG... 

13 




AAACCAC 189 TGT.. ,T 249 GA.. ,GTCGCTA 269 TGG... 

20 


RoBtCoV-HKU10 

301 

5'.. ,TTCTATC 28 TGC.. ,T 52 AG... 

8 

NC_018871 



GTGGCTC 190 TGA.. ,T 250 GA... 

20 c 




TCTTGTC 281 TGA.. ,T 308 AG.. .TGCCCAA 302 TGG... (overlaps ORF1 start) 

9 


Betacoronavirus A 





BCoV-Mebus 

210 

5'.. ,GCTTCAC 37 TGA.. ,T U3 AG... 

4 

U00735 



TCATTTC 145 TGC.. ,T 184 AG.. ,GTCACAA 211 TGT... 

13 


HCoV-OC43 

210 

5'.. ,GCTTCAC 37 TGA.. ,T 49 AG... 

4 

NC_005147 



TCATTTC 145 TGC.. ,T 184 AG.. ,GTCACAA 211 TGT... 

13 


PHEV-VW572 

210 

5'.. ,GCTTCAC 37 TGA.. ,T 49 AG... 

4 

NC_007732 



TCATTTC 145 TGC.. ,T 184 AG.. ,GTCACAA 211 TGT... 

13 


ECoV 

208 

5'.. ,GCTTCAC 37 TGA.. ,T 49 AG... 

4 

NC_010327 



tttctac 147 tgt. . ,t 183 ag. . ,GTCACAA 209 TGG... 

12 


MHV-A59 

209 

5'.. ,ATAGTGC 128 TGA.. ,T 146 GA... 

6 C 

NC_001846 



CGUUCUC 159 TGC.. ,A 210 TGG... (joins ORF1) 

17-ORF1 


MHV-JHM 

214 

5'.. ,CACTTGC 94 TGC.. ,T 151 GA... 

19 

NC_006852 



CGTTCTC 164 TGC.. ,A 215 TGG... (joins ORF1) 

17-ORF1 


RbCoV-HKU14 

208 

5'.. ,GATTC 5 TGA.. ,T 59 AA.. .GTCATAA 208 TGC... 

18 c 

NC_017083 

HCoV-HKU1 

205 

5'.. ,ATCTCTC 158 TGC.. ,t 197 ag. . ,gtcgcaa 206 tga. .. 

13 

NC_006577 

Betacoronavirus B 





SARS-CoV-Tor2 

264 

5'.. ,gtagatc 56 tgt. . ,t 86 ag. .. 

10 

NC_004718 



taaaatc 81 tgt. . ,t I53 ga. .. 

24 




GTGTAGC 81 TGT.. ,A I04 TG... (joins AUG-initiated uORF) 

5 




GCTCGGC 100 TGC.. ,T 109 AG... 

3 




ATTTTAC 146 TGT.. ,T 167 AA... 

7 




CCTCTTC 182 TGC.. ,T 233 AG... 

17 




TGCAGAC 189 TGT 261 AA.. ,GGTAAGA 265 TGG... 

24 


Betacoronavirus C 





BtCoV-133/2005 

258 

5'.. .GCCTTGC 88 TGT.. .T m AG... 

11 

NC_008315 



TGTGGTC 101 TGC.. ,T 167 AA... 

22 




TTCATTC 184 TGA.. ,T 301 AA.. .CACACCA 259 TGC... (overlaps ORF1 start) 

39 c 


TyBtCoV-HKU4 

266 

5'. . ,GCCTTGC 85 TGT.. ,T 118 AG... 

11 

NC_009019 



TGTGGTC 108 TGC.. ,T 174 AA... 

22 




TTCATTC 191 TGA.. ,T 281 AG... 

30 c 




AATACCC 231 TGT.. ,CATACTA 2S7 TGC... (joins ORF1) 

12 


PiBtCoV-HKU5 

260 

5'. . ,TGCGTGC 95 TGC.. ,T 119 AG... 

8 

NC_009020 



ACCTTTC 108 TGC.. ,A I41 TG... 

11 




acaccac 151 tgg. . ,T 172 AA... 

7 




TTAAAAC IS7 TGA.. ,T 307 AG.. .CACATCA 261 TGT... (overlaps ORF1 start) 

47 c 


MERS-CoV 

288 

5'. . ,ACTTGTC 110 TGG.. ,T 188 AA.. ,CACATCA 289 TGT... 

6 

NC_019843 

Betacoronavirus D 





RoBtCoV-HKU9 

228 

5'. . ,GTCTTGC 16 TGT.. ,T 157 AA... 

47 

NC_009021 



GTCGTCC 192 TGT.. ,T 243 GA.. ,GTAGTGA 229 TGG... (overlaps ORF1 start) 

17 




Gammacoronavirus 





IBV-Beaudette 

528 

5'. . ,CTACAGC 86 TGG.. ,T I19 AG... 

15 

NC_001451 



TGGCACC 136 TGG.. ,T 396 GA... 

86 




ATACATC 221 TGT.. ,T 299 AG... 

26 




GAACCTC 289 TGG.. .T 448 AG... 

53 




CAGGTTC 486 TGG.. .T 522 GACAACA 529 TGG... 

12 c 


TCoV 

528 

5'.. ,CTACAGC 86 TGG.. ,T ISI AG... 

15 

NC_010800 



AGTGCCC 117 TGG.. ,T 169 AA... 

14 c 




TGGCACC 138 TGG.. ,T 396 GA... 

86 




CAGGTTC 486 TGG.. .T 522 GACAACA 529 TGG... 

12 c 


CoV SW1 

523 

5'.. ,TGTTTCC 9S TGA.. ,T 272 AA... 

58 

NC_010646 



TGGCAGC 126 TGG.. ,T 3S0 AG... 

78 




CGGCTTC 151 TGG.. ,T 406 AA... 

24 




TTCTACC 244 TGG.. ,T 406 AA.. ,GCAAACA 524 TGT... 

54 


Deltacoronavirus 





NHCoV-HKU19 

481 

5'.. .ACCATTC 115 TGA.. ,T 271 AG... 

52 c 

NC_016994 



GCCCCTC I89 TGT.. ,T 303 AG... 

38 




CCGAGCC 299 TGG.. ,T 368 GA... 

23 c 




CTCAAGC 393 TGA.. ,T 441 AG.. ,AAGAAGA 482 TGG... 

16 c 


WiCoV-HKU20 

218 

5'.. ,TCAGGAC 129 TGC.. ,T 144 AG... 

5 

NC_016995 



GGCACTC 200 TGG.. ,T 215 AG.. ,ACTAGTA 219 TGG... 

5 C 


CMCoV-HKU21 

477 

5'.. ,TACGTGC 94 TGC.. ,T 133 AA... 

13 

NC_016996 



ATTTTGC 122 TGT.. ,T 203 AG... 

27 




CGTATTC 404 TGT.. ,T 41S AA... 

4 




CCTATTC 447 TGC. . ,T 465 AA.. ,ACCA 478 TGA... 

6 


PorCoV-HKU15 

538 

5'.. ,GTGCGTC 93 TGC.. ,T 207 AG. .. 

38 

NC_016990 



GTTCCTC 254 TGA.. ,T 284 GA. .. 

10 




ACAGCAC 284 TGA.. ,T 430 AG... 

30 c 




ACCGGTC 314 TGC.. ,T 395 GA... 

27 




AGTGATC 451 TGA.. ,T 481 GA... 

10 c 




TCTGATC 456 TGG.. .T 525 GA.. ,TGTGAAA 539 TGG... 

23 c 


SpCoV-HKU17 

519 

5'.. .GGGGCGC 106 TGT.. ,T 328 AG... 

74 

NC_016992 



GATTACC 133 TGG.. ,T 254 AG... 

40 




GTTCCTC 234 TGG.. ,T 264 GA... 

10 




ACAGCAC 263 TGA.. ,T 353 AG... 

30 c 




ACCGGTC 294 TGC.. ,T 4I7 AG... 

41 




TCTGATC 436 TGG.. ,T 505 GA.. ,TGAGAAA 520 TGG... 

23 c 


MunCoV-HKU13 

594 

5'.. .CTTTGGC 116 TGA.. ,T 347 AG... 

77 

NC_011550 



TGGTCAC 132 TGC.. ,T 207 AG... 

25 




AAAGGCC 229 TGG.. ,T 268 AG... 

13 c 




AGTGATC 50S TGA.. ,T 545 AG... 

13 c 




TCTGATC 511 TGG.. ,T 580 GA... 

23 c 




GCAGCTC 573 TGT.. ,T 385 AG.. ,TTTGGAA 595 TGG... 

4 


MRCoV-HKU18 

595 

5'.. ,AACGGCC 15I TGG.. ,T I90 AG... 

13 c 

NC_016993 



GGCTCGC 161 TGG.. ,T 350 AG... 

63 




CACGGCC 229 TGG.. ,T 268 AG... 

13 c 




TCTTCTC 298 TGT.. ,T 331 AG... 

11 




GTTAAGC 3S0 TGT. . ,T 429 AG... 

23 




ACCGGTC 370 TGC.. ,T 493 AG... 

41 




AGTGATC 507 TGA.. ,T 546 AG... 

13 c 




TCTGATC 512 TGG.. ,T 581 GA.. ,TTTGAGA 59S TGG... 

23 c 


ThCoV-HKU12 

591 

5'.. ,ATTTTGC 35 TGC.. ,T 302 AA... 

89 

FJ376621 



TACTACC 2I7 TGT.. ,T 235 AG... 

6 




ATTCCTC 316 TGA.. ,T 454 AA... 

46 




AGTGACC 503 TGA.. ,T 542 AG... 

13 c 




CCTATTC 5S2 TGC.. ,T 580 AA... 

6 




AGCTGCC 572 TGA.. ,T 598 GA.. ,TCAGATA 592 TGG... (overlaps ORF1 start) 

9 


BuCoV-HKU-11 

506 

5'.. ,GTTGTGC 94 TGG.. ,T I15 AG... 

r 

FJ376619 



CAGTGCC 105 TGC.. ,T I41 AA... 

12 




TTTCGGC 168 TGT.. ,T 255 AG... 

29 




GATTGTC 179 TGT.. ,T 212 GA... 

11 




TACTTGC 339 TGA.. ,T 360 AG... 

7 




ACCGGTC 380 TGC.. ,T 497 AG... 

39 




CCTATTC 577 TGC.. ,T 595 AG... 

6 




AGCTGCC 587 TGA.. ,T 602 AGATA 607 TGG... 

5 


WECoV-HKU16 

510 

5'.. ,ACAAAGC 8 TGA.. ,T 44 AG... 

12 c 

NC_016991 



CTTAGGC 95 TGG.. ,T 128 AG... 

\2 C 




GAACTAC 135 TGG.. .T 255 AA... 

40 




ACCGCTC 294 TGC.. ,T 408 AG... 

38 




TCTAAGC 377 TGT.. ,T 46 I AG... 

28 




GGCTCGC 491 TGG.. ,T 584 AA.. ,TTTGATA 511 TGG... (overlaps ORF1 start) 

31 



a Data from GenBank (15 August 2013). 

b An optimal Kozak context is considered to be GCC A/G CCAUGG (see the text). 

c Has a purine in the −4 and +1 positions at the ORF for this peptide, denoting a potentially “good to excellent” Kozak context for translation initiation. 
(A) 5'-proximal sequences of the WT and mutants M1 through M6. Positions marked in the figure: nt 99 (uORF AUG), 112, 123, 129, 138, 146, 210 (ORF1 AUG), and 219 (AUG). 

WT  5'...GUGUCC AUG CCCGCGGGCCUGGUCUUGACU UAG UGC UGA CAUUUG UAG UUCCUUGAC...UGCAUA AUG GCAAAG AUG G... 
M1  5'...GUGUCC UAG CCCGCGGGCCAGGUCUUGACU UAG UGC UGA CAUUUG UAG UUCCUUGAC...UGCAUA AUG GCAAAG AUG G... 
M2  5'...GUGUCC ACG CCCGCGGGCCUGGUCUUGACU UAG UGC UGA CAUUUG UAG UUCCUUGAC...UGCAUA AUG GCAAAG AUG G... 
M3  5'...GUGUCC AUG CCCGCGGGCCUGGUCUUGACU CAG UGC CGA CAUUUG CAG UUCCUUGAC...UGCAUA AUG GCAAAG AUG G... 
M4  5'...GUGUCC ACG CCCGCGGGCCUGGUCUUGACU CAG UGC CGA CAUUUG CAG UUCCUUGAC...UGCAUA AUG GCAAAG AUG G... 
M5  5'...GUGUCC AUG CCCGCGGGCCUGGUCUUGACU UAG UGC UGA CAUUUG UAG UUCCUUGAC...UGCAUA AGG GCAAAG AUG G... 
M6  5'...GUGUCC AUG CCCGCGGGCCUGGUCUUGACU CAG UGC CGA CAUUUG CAG UUCCUUGAC...UGCAUA AGG GCAAAG AUG G... 

(B) In vivo 

Virus   uORF reselected   Nucleotide reversion                                                      When 
WT      NA                NA                                                                        NA 
M1      Yes               Yes, at all three sites, both trials                                      VP5 
M2      Yes               Yes                                                                       VP10 
M3      Yes               No, G insertion after nt 140 made stop (UGA) at nt 147, creating uORF     VP7 
M4      No                No at VP10                                                                NA 
M5      No                No at VP8                                                                 NA 
M6      No                No at VP8                                                                 NA 

FIG 2 Disruptive point mutations in the uORF and subsequent reselection of the uORF. (A) Description of mutations in M1 through M6. ORFs are identified by shading. Mutated nucleotides are identified by boldface type. Bold arrowheads identify positions of WT start codons. The naturally occurring translation start and stop codons are underlined. Nucleotides are numbered beginning with the genome 5' end. (B) Summary of WT and mutant recombinant virus behavior for M1 through M6. VP, virus passage; NA, not applicable. (C) Electrophoresis of radiolabeled proteins from in vitro (RRL) translation reactions in one representative experiment. (Top) SDS-PAGE of in vitro-synthesized nsp1 protein or the uORF-nsp1 fusion protein from 100 ng of RNA transcript. Quantitation was determined by scintillation counting of excised bands. (Middle) Percentage of methionine-normalized counts relative to those in the WT band. (Bottom) Separate ethidium bromide-stained agarose gel showing electrophoretically separated RNA from 500 ng loaded per lane. (D) A single growth kinetics analysis where the MOI was 1.0 for the WT and M1 through M6. (E) Plaques of WT, M1, M2, M3, M4, and M6 viruses. 


fusion genotype (not shown). However, it was not determined 
whether the replicating virus used a fused translation product or 
used the ORF1 product initiating from the site at nt 210. The surprise 
from this experiment was that the uORF-ORF1 fusion virus was viable, 
and its replication was robust, judging from both plaque size and 
growth kinetics. This mutant was also surprisingly stable since the 
fused genotype remained for six passages (described below). 

None of four virus mutants with uORF-disrupting mutations 
showed debilitated growth in cell culture, yet a uORF in 
three mutants was reselected within 10 virus passages. To test 
whether translation of the uORF in the virus genome is needed for 
virus replication in cell culture, four mutants were studied. In the 
first mutant, M1, the uORF was blocked by changing its AUG start 
codon to a UAG stop codon, and a U112A mutation was also 
made to maintain a stem-loop 4 structure previously shown to be 
a cis-acting requirement for bovine coronavirus DI RNA replication 
(15). In two separate experimental trials, starting in each case 
with freshly synthesized recombinant RNA from ligated mutated 
plasmid DNA fragments, recombinant virus was recovered from 
transfection, and when measured at the first viral passage, the 
progeny had WT-like plaques and WT-similar growth kinetics 
(Fig. 2D and E) but the fully mutated sequence. By passage 5, it was 
found by RT-PCR sequencing analysis with RNA from infected 
cells that the three mutated sites had reverted to the WT (Fig. 2B). 
In addition, plasmid constructs of M1 were used to generate transcripts 
for in vitro translation in the same manner as described 
above for the WT and M3, and transcripts were translated in RRL. 
From M1, as from the WT, only a single band of protein initiating 
from the ORF1 start site at nt 210 was observed (Fig. 2C, top). 
From experiments with M1, therefore, we conclude that a separate 
uORF entity is not necessary for virus replication in cell culture 
but is nevertheless rapidly reselected within four viral passages. 
The uORF therefore may provide a survival advantage for the 
virus. 

To determine if the uORF AUG would be reselected from a 
second type of ORF-disrupting mutation, M2 was made, in which 
the genome sequence was the WT sequence except that ACG, a 
weak noncanonical start codon (44), replaced the AUG uORF 
start codon. In M2, in which ORF1 starting at nt 210 is the first 
AUG-initiating codon to be approached by a scanning ribosome 
(Fig. 2A), viable virus was recovered within 48 hpt, and both progeny 
plaques and growth kinetics were similar to those of the WT 
(Fig. 2D and E). Reversion to a WT uAUG codon in M2 was not 
observed until virus passage 10 (Fig. 2B). Conceivably, the uCUG 
at nt 111 in M2, encoding a potential peptide of 4 aa, could have 
initiated uORF translation and therefore functionally replaced the 
WT AUG-initiated uORF. However, this appears unlikely since 
there was extremely little product made of the size expected for 
the uCUG-ORF1 fusion protein initiating at nt 111 in M4 (described 
below). By gel electrophoresis, the product size from 
the in vitro translation of M2 was the same as that from the WT 
and M1 (Fig. 2C). 

To test for reselection, a third type of mutant, M3, containing 
the uORF fused in frame with ORF1 as described above, was studied. 
Since a separate uORF could be reselected by formation of not 
only a new AUG start codon but also a new stop codon within the 
contiguous uORF-ORF1 fused region (Fig. 2A), reselection by either 
of these mechanisms was sought by further passaging of M3 
progeny. For this, the 5'-UTR sequence was determined in each of 
eight serial passages of progeny virus. Interestingly, at passage 7, a 
G insertion was found just after nt 140, which created a frameshift 
and a consequential UGA stop codon beginning at nt 147 that 
extended the original 8-codon uORF to 16 codons. 
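
The frame arithmetic behind this M3 result can be checked with a short sketch. The Python below is illustrative only: the 51-nt fragment (nt 99 to 149) is transcribed from the WT sequences shown in Fig. 2A and 3A and should be treated as an assumption, and `substitute`, `insert_after`, and `first_orf` are hypothetical helper names introduced here, not part of the study's methods.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}
FIRST_NT = 99  # genome position of the first base of the fragment (uORF AUG)

# Assumed WT fragment, nt 99-149, written codon by codon in the uORF frame.
WT = "".join([
    "AUG", "CCC", "GCG", "GGC", "CUG", "GUC", "UUG", "UCA",  # 8-codon uORF
    "UAG",                       # uORF stop at nt 123
    "UGC", "UGA", "CAU", "UUG",  # in-frame UGA at nt 129
    "UAG",                       # in-frame UAG at nt 138
    "UUC", "CUU", "GAC",
])

def substitute(seq, nt, base):
    """Replace the base at 1-based genome position nt."""
    i = nt - FIRST_NT
    return seq[:i] + base + seq[i + 1:]

def insert_after(seq, nt, base):
    """Insert one base immediately after 1-based genome position nt."""
    i = nt - FIRST_NT + 1
    return seq[:i] + base + seq[i:]

def first_orf(seq):
    """Read codons from the first base; return (codon_count, stop_nt or None)."""
    count = 0
    for i in range(0, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return count, FIRST_NT + i
        count += 1
    return count, None

# M3: the three in-frame stops become CAG, CGA, and CAG (U-to-C at nt 123, 129, 138).
m3 = WT
for nt in (123, 129, 138):
    m3 = substitute(m3, nt, "C")

# Passage 7 of M3: a single G inserted just after nt 140.
m3_vp7 = insert_after(m3, 140, "G")

print(first_orf(WT))      # (8, 123)
print(first_orf(m3))      # (17, None): no in-frame stop within the fragment
print(first_orf(m3_vp7))  # (16, 147)
```

Under these assumptions, the insertion shifts the downstream UUCCUUGAC into GUU CCU UGA, which reproduces the reported UGA at nt 147 and the 16-codon uORF.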

To test for reselection of the uORF in a fourth mutant type, M4 
was made, in which the mutation in M2 (a uORF AUG→ACG 
conversion) was combined with the mutations in M3 (conversion 
of the three in-frame stop codons to nonstop codons) (Fig. 2A). 
Reselection of a uORF in this case would require a reversion of 
ACG to AUG or the formation of a new AUG along with a reversion 
of one of the coding sequences CAG, CGA, and CAG to a stop 
codon or the formation of a new stop codon elsewhere. M4 was 
immediately viable following RNA transfection, and the plaque 
size and virus growth kinetics were similar to those of the WT (Fig. 
2D and E). After 10 passages, there was no re-formation of a uORF 
(Fig. 2B). Regarding the question of whether or not the CUG-initiated 
short uORF in M2 is translated, synthesis of a second 
large polypeptide during M4 translation in vitro would have indicated 
that it is. As is evident from the M4 product shown in Fig. 
2C, only a very small amount of in vitro-generated fusion protein 
was made, indicating that initiation from uCUG was probably 
minimal (note the faint band immediately above the major band 
in the M4 lane). It may be, however, that uCUG-initiated translation 
is more robust in virus-infected cells. 

Thus, under the conditions of these experiments with M1, M2, 
M3, and M4, it appears that a uORF is not necessary for virus 
replication in cell culture, but it may provide a survival advantage 
or degree of fitness for MHV replication that leads to its reselection. 

Point mutations that disrupt the uORF cause an increased 
rate of translation from the (main) ORF1 start codon in vitro. 

Our analyses of translation initiation downstream of the uORF 
have assumed that it begins at nt 210. However, just 9 nt downstream, 
beginning at nt 219, an alternate AUG is found in a good 
Kozak context, which could function as the site for translation 
initiation (Fig. 2A). To establish whether the AUG at nt 219 can 
initiate translation of ORF1, the AUG at nt 210 in WT and M3 
mutant viruses was converted to a nonstart AGG codon to create 
M5 and M6, respectively (Fig. 2A), and in vitro translation products 
of these mutants were compared with those of the WT and 
M1 through M4 (Fig. 2C). As can be observed, the putative nonfused 
products of M5 and M6 are slightly smaller and in smaller 
amounts than the product beginning at the AUG at nt 210, indicating 
that there is a translation product initiating at nt 219 and 
that it is less abundant. Interestingly, viruses produced from transfected 
M5 and M6 recombinant genomes were viable and revealed 
no reselection of a uORF after eight virus passages (Fig. 2B). M6 
made WT-like plaques and had WT-like growth kinetics (Fig. 2D; 
M5 was unavailable for growth kinetic analysis). It was therefore 
concluded that the AUG at nt 210 was the bona fide start codon 
used in M1 through M4 and reflected the natural ORF1 start 
codon. 

To determine whether the uORF has an influence on the rate of 
translation from ORF1, the M1 through M6 constructs containing 
the partial nsp1 ORF were used to determine translatability in 
RRL relative to the WT (Fig. 2C). To quantitate the relative 
amounts of protein produced, [35S]Met was used in the translation 
reaction mixture, and protein bands identified by exposure of 
the gel to X-ray film were isolated and quantified by scintillation 
counting. As shown in Fig. 2C (top), the product from each construct 
excepting M5 and M6 appeared more abundant than the 
WT. In the case of M3 and M6, two products were made, probably 
due to initiation at the uORF to yield the fusion product and 
separate initiation at the ORF1 start site to yield the shorter product. 
Radioactivity quantitation demonstrated that the level of 
translation was higher in each mutant than in the WT (100%), 
ranging from 169% in M1 to 113% in M3 (Fig. 2C, middle panel, 
bottom band). Five hundred nanograms of each transcript was 
separately analyzed by electrophoresis in a nondenaturing agarose 
gel and stained with ethidium bromide as a loading control (Fig. 
2C, bottom). Thus, the uORF has the effect of repressing translation 
from ORF1 in vitro in RRL. 
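
The methionine normalization used for Fig. 2C (middle) amounts to dividing each band's counts by the number of Met residues in that product and expressing the result relative to the WT band. The sketch below shows only that arithmetic; the counts and Met numbers are hypothetical placeholders chosen for illustration, not measured values from this study, and `relative_translation` is a name introduced here.

```python
def relative_translation(counts, met_residues, wt_counts, wt_met_residues):
    """Percent of methionine-normalized counts relative to the WT band."""
    per_met = counts / met_residues
    wt_per_met = wt_counts / wt_met_residues
    return 100.0 * per_met / wt_per_met

# Hypothetical scintillation counts (cpm) and Met residues per product.
wt_band = {"counts": 12000.0, "met": 6}
mutant_band = {"counts": 20280.0, "met": 6}   # same product, same Met number
fusion_band = {"counts": 7910.0, "met": 7}    # a longer product with one extra Met

for name, band in (("mutant", mutant_band), ("fusion", fusion_band)):
    pct = relative_translation(band["counts"], band["met"],
                               wt_band["counts"], wt_band["met"])
    print(f"{name}: {pct:.1f}% of WT")
```

Normalizing per Met matters when two bands differ in length, as with a uORF-nsp1 fusion product, because a longer product can incorporate more label per molecule.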

Deletion mutations of 20, 30, and 51 nt, all within stem-loop 
4 and each removing the uAUG and a large portion of the uORF, 
replicated, but only in the first two mutants did 10 passages of 
virus progeny reveal an alternate AUG-initiated uORF. To determine 
whether uORF removal would affect replication, constructs 
with deletions of four different sequence lengths that included 
the uAUG (Fig. 3A) were tested. Consistent with the 
findings of Yang et al. (31) and also extending them, our results 
demonstrate that deletions of 20, 30, and 51 nt of stem-loop 4, 
[FIG 3A: stem-loops 3 and 4; marked nt positions 57, 58, 75, 77, 80, 91, 96, 99, 115, 123, 129, and 138]

WT        5'...A....AAUCUAA UCUAAAC UUUAUAAACGGCACUUCCUGCGUGUCC AUG CCCGCGGGCCUGGUCUUGUCA UAG UGCUGACAUUUG UAG UG... 

MΔ96-115  5'...A....AAUCUAAUCUAAACUUUAUAAAUGGCACUUCCUGCGUG.CUUGUCAUAGUGCUGACAUUUGUAGUG...  (arrow at C80U) 

MΔ91-120  5'...A....AAUCUAAUCUAAACUUUAUGAACGGCACUUCCU.CAUAGUGCUGACAUUUGUAGUG...  (arrow at A77G) 

MΔ80-130  5'...AAUCUAAUCUAAUCUAAACUUUAUAAA.ACAUUUGUAGUG...  (arrows at the 4-nt AUCU insertion; no uORF) 

MΔ75-138  5'...A....AAUCUAAUCUAAACUUU.AGUG...  (lethal) 

[FIG 3B to D: growth kinetics, plaques, and Northern analysis; labeled Northern bands include sgRNA6 and sgRNA7]


FIG 3 Deletion mutations and subsequent reselection of uORFs in progeny virus. (A) WT sequence positions of stem-loops 3 and 4 as noted in Fig. 1. The uORF 
is shown by shading. The heptameric RdRp template-switching signal, UCUAAAC, is underlined. In mutant virus MΔ96-115, the C80U transition causing a new 
uAUG in virus passage 10 is identified with an arrow. In mutant virus MΔ91-120, the A77G transition causing a new uAUG in virus passage 10 is identified with an arrow. 
In MΔ80-130, a 4-nt insertion, AUCU, occurs between nt 57 and 58 by virus passage 10, but no new uORF is formed by this insertion. Note that this insertion 
creates a new UCUAA element, a spontaneous phenomenon previously described for the MHV genome near this site. With mutant MΔ75-138, no progeny virus 
was recovered following recombinant RNA transfection. (B) Growth kinetics analyses where the MOI was 1.0 for the WT and mutants at virus passages 1 and 10. 
(C) Virus plaques at 48 hpi for WT and mutant viruses at virus passage 1. (D) Northern analysis for each replicating virus using a hybridization probe that 
identifies a 3'-end sequence. The same number of cells was used to prepare RNA for each lane. 


which includes the AUG of the uORF, and 17 nt (70%), 22 nt 
(91%), and 24 nt (100%) of the uORF, respectively, can be made 
without a loss of virus viability. Only the fourth mutant, with a 
deletion of 64 nt that extended beyond both ends of stem-loop 4 
(as depicted in Fig. 1B), was lethal, as was the same deletion in the 
study by Yang et al. (31) (Fig. 3A). By mfold analysis, stem-loop 4 
becomes shortened but not otherwise distorted in mutants with 
deletions of 20 nt (MΔ96-115) and 30 nt (MΔ91-120) (Fig. 1B 
and 3A and data not shown). For the three viable deletion mutants, 
WT-like plaques at virus passage 1 were found for each 
mutant (Fig. 3C), but only mutants with deletions of 20 nt 
(MΔ96-115) and 30 nt (MΔ91-120) had a reselected uORF after 
10 passages as a result of upstream C80U and A77G transitions, 
respectively (Fig. 3A), and an accompanying return to WT-like 
growth kinetics (Fig. 3B). Mutants with the two largest deletions, 
30 nt (MΔ91-120) and 51 nt (MΔ80-130), showed dramatically 
reduced RNA production, as observed by Northern analysis (Fig. 
3D). Thus, our experiments confirmed the observations of Yang et 
al. that showed that large portions of stem-loop 4 can be deleted 
without killing the virus (31) but also extended them to include 
the observations that (i) a precise deletion of stem-loop 4, i.e., nt 
80 through 130, as defined in Fig. 1B and as modeled by Chen and 
Olsthoorn (45), is also not lethal or restrictive of sgmRNA synthesis 
and (ii) passaging of virus with deletions of nt 96 through 115 
and nt 91 through 120 led to reselection of a uORF. Interestingly, 
in our viable deletion mutant of nt 80 through 130, an insertion of 
4 nt, AUCU, was found between nt 57 and 58 at virus passage 10, 
which led to a new UCUAA element upstream of the leader fusion 
site for leader acquisition. A similar insertion was found by Yang 
et al. (31) and was also found to occur spontaneously in a similar 
position in WT MHV during passaging in cell culture (46). It is 
also part of a UCUAA sequence at this position in the MHV-JHM 
strain (GenBank accession number X00990) that is not present in 
the MHV-A59 strain (47). 
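
How the C80U and A77G transitions regenerate a uORF in the deletion mutants can be traced with a small sketch. It is a minimal illustration under stated assumptions: the nt 72 to 140 fragment is transcribed from the WT line of Fig. 3A, and `genome`, `delete`, `substitute`, and `uorfs` are hypothetical helpers that keep genome coordinates attached to each base so that positions stay meaningful after a deletion.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}
FIRST_NT = 72  # genome position of the first base in the assumed fragment

WT = ("UUUAUAAACGGCACUUCCUGCGUGUCC"   # nt 72-98
      "AUGCCCGCGGGCCUGGUCUUGUCAUAG"   # nt 99-125: WT uORF plus its UAG stop
      "UGCUGACAUUUGUAG")              # nt 126-140

def genome(seq):
    """Map each 1-based genome position to its base."""
    return {FIRST_NT + i: base for i, base in enumerate(seq)}

def delete(g, first, last):
    """Remove positions first..last inclusive."""
    return {nt: b for nt, b in g.items() if not first <= nt <= last}

def substitute(g, nt, base):
    """Return a copy with one base changed."""
    return {**g, nt: base}

def uorfs(g):
    """List (AUG position, codon count) for every AUG with an in-frame stop."""
    nts = sorted(g)
    seq = "".join(g[nt] for nt in nts)
    found = []
    for i in range(len(seq) - 2):
        if seq[i:i + 3] != "AUG":
            continue
        for j in range(i + 3, len(seq) - 2, 3):
            if seq[j:j + 3] in STOP_CODONS:
                found.append((nts[i], (j - i) // 3))
                break
    return found

wt = genome(WT)
d20 = delete(wt, 96, 115)    # corresponds to the 20-nt deletion mutant
d30 = delete(wt, 91, 120)    # corresponds to the 30-nt deletion mutant

print(uorfs(wt))                        # [(99, 8)]
print(uorfs(d20))                       # []
print(uorfs(substitute(d20, 80, "U")))  # [(79, 8)]
print(uorfs(d30))                       # []
print(uorfs(substitute(d30, 77, "G")))  # [(75, 6)]
```

With this fragment, both reselected AUGs fall in frame with the original UAG at nt 123, so each transition alone is enough to recreate a short uORF upstream of ORF1.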

Thus, as with the uORF-disrupting point mutations, disruption 
of the uORF by deletions was not necessarily lethal for the 
virus, but the uORF nevertheless, as indicated by its reappearance, 
apparently plays a beneficial role in the virus in cell culture. The 
surprise in these experiments was that the entire stem-loop 4 (nt 
80 through 130) could be deleted without killing the virus. Therefore, 
while stem-loop 4 was identified as a cis-acting replication 
element for BCoV DI RNA, it was not found to be similarly required 
for the replication of the intact MHV genome (15, 31; this 
study). 

DISCUSSION 

Translation of the coronavirus genome and sgmRNAs has been 
presumed to follow cap-dependent 5'-end ribosomal entry and 
ribosomal scanning. This is based on the presence of a methylated 
cap on genomic RNAs and sgmRNAs (48), on the presence of 
virus-encoded enzymes involved in capping (19-24), and on evidence 
that cap-inhibiting drugs impair virus replication (49). The 
role of a nearly universally found intra-5'-UTR AUG-initiated 
uORF in the coronavirus genome as a potential regulator of 5'-end 
scanning-dependent translation, however, is not known. 
Here, we have used MHV as a model coronavirus in cell culture to 
test the hypothesis that the single AUG-initiated uORF is translated 
and thereby functions to regulate ORF1 (the main ORF) 
translation and, consequently, virus replication. The data show 
that while disruption of the uAUG codon enhances translation of 
ORF1 in vitro, the mutation has no discernible effect on virus 
replication, as measured in cell culture during a 24-h infection 
period (Fig. 2). Furthermore, only moderate effects on virus replication 
were observed when partial or total deletions of the uORF 
were made, which might have been due to structural changes in 
the cis-acting stem-loop 4 or other structures and not translation 
of the uORF per se (Fig. 3) (15, 31). The data also show that a uORF 
was reselected within 10 virus passages for each of three methods 
used to disrupt the uORF: (i) mutations within the AUG start 
codon, (ii) fusion of the uORF with the main ORF (ORF1), and 
(iii) deletion of part or all of the uORF (Fig. 2 and 3). Restoration 
of a uORF by reselection brought back a near-WT-like phenotype 
in virus that had been debilitated by partial or complete deletion of 
the uORF. Therefore, it appears that one function of the AUG-initiated 
uORF is to attenuate ORF1 translation such that it provides 
a currently unidentified advantage for virus survival. 

A genomic AUG-initiated uORF is not found in some coronaviruses 
(Table 1). These include bat coronavirus HKU9, a group D 
betacoronavirus; beluga whale coronavirus SW1, a gammacoronavirus; 
and wigeon coronavirus HKU20, sparrow coronavirus 
HKU17, munia coronavirus HKU13-3514, magpie-robin coronavirus 
HKU18, thrush coronavirus HKU12-600, bulbul coronavirus 
HKU11-934, and white-eye coronavirus HKU16, all members 
of the deltacoronavirus subgroup (42). Since the noncanonical 
CUG initiator codon is known to function to initiate translation in 
some cases, including uORFs (2, 50-54), potential CUG-initiated 
uORFs were sought by inspection of coronavirus genomes. Interestingly, 
one or more potential CUG-initiated uORFs can be 
found in almost all coronaviruses (Table 2), but only in the deltacoronaviruses 
are the CUG codons in a good enough Kozak context 
(-3 A/G and +4 A/G) (55) for likely use, suggesting that some 
deltacoronaviruses may use a CUG-initiated uORF in place of an 
AUG-initiated uORF. The potential in-frame uCUG initiator 
codon in MHV-A59 in a good Kozak context (AUAGUGC 128 
UGA) (Table 2) appears to make only a very minor amount of 
protein via in vitro translation (discussed above as a barely perceptible 
band in Fig. 2C, lane M4); however, this amount could be 
larger in vivo. 
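
The Kozak criterion quoted above (a purine at -3 and a purine at +4, with the first base of the initiator codon counted as +1) is easy to state as a check. The following is a minimal sketch of that rule alone, not a predictor of initiation efficiency; `kozak_context_ok` is a hypothetical name, and the third test sequence is a made-up negative example.

```python
PURINES = {"A", "G"}

def kozak_context_ok(seq, start):
    """True if positions -3 and +4 relative to the initiator codon are purines.

    `start` is the 0-based index of the first base (+1) of the codon in `seq`.
    """
    if start < 3 or start + 3 >= len(seq):
        raise ValueError("need 3 nt upstream and 1 nt downstream of the codon")
    return seq[start - 3] in PURINES and seq[start + 3] in PURINES

# MHV-A59 uCUG context as quoted in the text: AUAGUGC(128)UGA, CUG at index 6.
print(kozak_context_ok("AUAGUGCUGA", 6))  # True: G at -3, A at +4

# Optimal context from Table 2, footnote b: GCC A/G CC AUG G (A variant shown).
print(kozak_context_ok("GCCACCAUGG", 6))  # True: A at -3, G at +4

# Hypothetical poor context: pyrimidines at both positions.
print(kozak_context_ok("UUUUUUCUGU", 6))  # False
```

The check deliberately ignores every position other than -3 and +4, matching the simplified rule cited from reference 55 in the text.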

One role that the uORF might play in the coronavirus genome 
is that of repressing ORF1 translation relative to the amount of 
translation products needed from the sgmRNAs, which (mostly) 
carry no uORF. Since during coronavirus replication, the structural 
proteins are needed in far greater abundance than the nonstructural 
replicase proteins, repression of translation from ORF1 
may be a mechanism that keeps the relative amounts optimal. In a 
sense, this is a conceptual extension of the frameshifting regulatory 
paradigm within ORF1 that maintains an optimal ratio of 
ORF1a to ORF1b proteins (56, 57). Another possible role might be 
that the uORF contributes to long-term virus survival in cells during 
persistent infection. This is suggested by the spontaneous appearances 
of uORFs during development of persistent infections. 
In one example, a G5A spontaneous mutation developed during 
persistent infection with bovine coronavirus that formed a novel 
5'-proximal short AUG-initiated intraleader uORF (58). Because 
this uORF is in the common leader, it is also present in the 5' UTR 
of sgmRNAs, and its repressive effects would be expected for all 
viral mRNAs. In vitro translation analysis demonstrated that the 
presence of the novel uORF correlated with repression of sgmRNA7 
translation (58). In a second example, an A77G mutation in 
MHV was found only in the genomic 5' UTR arising during persistent 
infection in cultured cells that led to a 24-nt 5'-ward extension 
of the natural AUG-initiated uORF (59). A mechanistic connection 
between this mutation and virus persistence, however, is 
more difficult to envision, since the A77G mutation caused an 
~2.5-fold enhancement of translation, as determined by in vitro 
measurement, and an ~3.5-fold increase in p28 (nsp1) abundance, 
as determined by in vivo measurement (59). Curiously, this 
was the same spontaneous mutation that occurred in MΔ91-120 
(Fig. 3A) that restored a WT-like phenotype to the deletion mutant 
(Fig. 3C). 

More studies are needed to determine how the subtle effects of 
the uORF described here might be involved in the more dramatic 
translation regulatory events associated with acute coronavirus 
infection. For MHV, these include the property of robust viral 
protein synthesis at a time when there is global inhibition of host 
cell translation, presumably as a function of eukaryotic initiation 
factor 2 α-subunit (eIF2α) phosphorylation (60-63). eIF2α 
phosphorylation blocks formation of the 40S rRNA-GTP-eIF2α 
ternary complex required for cap-dependent initiation of translation 
(64). Interestingly, translation of MHV mRNA appears enhanced 
under these conditions, apparently as a result of an interaction 
between the viral leader sequence and the viral 
nucleocapsid protein (63, 65). In SARS-CoV-infected cells, translation 
of the viral mRNAs is favored over cellular mRNAs in part 
by an endonucleoproteolytic property of viral nsp1, which cleaves 
the 5'-terminal sequence of cellular but not viral mRNAs (66-68). 
In this light, the mechanisms by which uORFs regulate resistance 
to the effect of cell stress in other cellular and viral mRNAs might 
be instructive for further studies on coronavirus translation regulation. 
For example, uORF translation enhances shunting in cellular 
mRNA cIAP2 (9), in prototype foamy virus genomic RNA 
(11), and in rice tungro virus (4), in a way that enables the mRNA 
or viral RNA to escape translation inhibition. uORF-enhanced 
scanning in Ebola virus RNA (5) and hepatitis B virus RNA (6) 
also enhances translation. However, none of these special mechanisms 
for translation of coronavirus nsp1 have yet been described. 